Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 5570 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.0 MiB |
| Average record size in memory | 189.9 B |
Variable types
| Numeric | 13 |
|---|---|
| Categorical | 1 |
Alerts
| Alert | Type |
|---|---|
| Município has a high cardinality: 5570 distinct values | High cardinality |
| CV_HEPatite_B is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_HIB is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_DPT is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_Polio is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_rota is highly correlated with CV_HEPatite_B and 9 other fields | High correlation |
| CV_Pneumo is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_MNCC is highly correlated with CV_HEPatite_B and 9 other fields | High correlation |
| CV_SCR1 is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_SCR2 is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_VARICELA is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_HEPATITE_A is highly correlated with CV_BCG and 10 other fields | High correlation |
| CV_BCG is highly correlated with CV_HEPatite_B and 8 other fields | High correlation |
| Município is uniformly distributed | Uniform |
| COD has unique values | Unique |
| Município has unique values | Unique |
| CV_BCG has 431 (7.7%) zeros | Zeros |
| CV_SCR2 has 136 (2.4%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-09 00:44:44.842151 |
|---|---|
| Analysis finished | 2022-11-09 00:45:07.378372 |
| Duration | 22.54 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
COD
| Distinct | 5570 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 325358.6278 |
| Minimum | 110001 |
| Maximum | 530010 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 110001 |
|---|---|
| 5-th percentile | 150777.25 |
| Q1 | 251212.5 |
| median | 314627.5 |
| Q3 | 411918.75 |
| 95-th percentile | 510729.55 |
| Maximum | 530010 |
| Range | 420009 |
| Interquartile range (IQR) | 160706.25 |
Descriptive statistics
| Standard deviation | 98491.03388 |
|---|---|
| Coefficient of variation (CV) | 0.3027152977 |
| Kurtosis | -0.5258091553 |
| Mean | 325358.6278 |
| Median Absolute Deviation (MAD) | 74152.5 |
| Skewness | 0.1213411839 |
| Sum | 1812247557 |
| Variance | 9700483754 |
| Monotonicity | Not monotonic |
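Several of the figures above are derived from others, so they can be cross-checked with simple arithmetic. The sketch below recomputes the range, IQR, coefficient of variation, variance and mean from the values reported in this section. The variable names are illustrative, and small differences are expected because the report rounds its output.

```python
# Values copied from the statistics tables of this section.
minimum, maximum = 110001, 530010
q1, q3 = 251212.5, 411918.75
mean, std = 325358.6278, 98491.03388
total, n_obs = 1812247557, 5570

value_range = maximum - minimum   # reported Range: 420009
iqr = q3 - q1                     # reported IQR: 160706.25
cv = std / mean                   # reported CV: 0.3027152977
variance = std ** 2               # reported Variance: 9700483754
mean_from_sum = total / n_obs     # reported Mean: 325358.6278

print(value_range, iqr, round(cv, 10), round(variance), round(mean_from_sum, 4))
```

Each recomputed value should agree with its reported counterpart up to the rounding applied in the tables.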
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 110001 | 1 | < 0.1% |
| 353970 | 1 | < 0.1% |
| 354040 | 1 | < 0.1% |
| 354030 | 1 | < 0.1% |
| 354025 | 1 | < 0.1% |
| 354020 | 1 | < 0.1% |
| 354010 | 1 | < 0.1% |
| 354000 | 1 | < 0.1% |
| 353990 | 1 | < 0.1% |
| 353980 | 1 | < 0.1% |
| Other values (5560) | 5560 |
| Value | Count | Frequency (%) |
| 110001 | 1 | |
| 110002 | 1 | |
| 110003 | 1 | |
| 110004 | 1 | |
| 110005 | 1 | |
| 110006 | 1 | |
| 110007 | 1 | |
| 110008 | 1 | |
| 110009 | 1 | |
| 110010 | 1 |
| Value | Count | Frequency (%) |
| 530010 | 1 | |
| 522230 | 1 | |
| 522220 | 1 | |
| 522205 | 1 | |
| 522200 | 1 | |
| 522190 | 1 | |
| 522185 | 1 | |
| 522180 | 1 | |
| 522170 | 1 | |
| 522160 | 1 |
Município
| Distinct | 5570 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 467.3 KiB |
| 110001 Alta Floresta D'Oeste | 1 |
|---|---|
| 353970 Platina | 1 |
| 354040 Populina | 1 |
| 354030 Pontes Gestal | 1 |
| 354025 Pontalinda | 1 |
| Other values (5565) |
Length
| Max length | 39 |
|---|---|
| Median length | 34 |
| Mean length | 18.61059246 |
| Min length | 10 |
Characters and Unicode
| Total characters | 103661 |
|---|---|
| Distinct characters | 80 |
| Distinct categories | 6 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 5570 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 110001 Alta Floresta D'Oeste |
|---|---|
| 2nd row | 110002 Ariquemes |
| 3rd row | 110003 Cabixi |
| 4th row | 110004 Cacoal |
| 5th row | 110005 Cerejeiras |
Common Values
| Value | Count | Frequency (%) |
| 110001 Alta Floresta D'Oeste | 1 | < 0.1% |
| 353970 Platina | 1 | < 0.1% |
| 354040 Populina | 1 | < 0.1% |
| 354030 Pontes Gestal | 1 | < 0.1% |
| 354025 Pontalinda | 1 | < 0.1% |
| 354020 Pontal | 1 | < 0.1% |
| 354010 Pongaí | 1 | < 0.1% |
| 354000 Pompéia | 1 | < 0.1% |
| 353990 Poloni | 1 | < 0.1% |
| 353980 Poá | 1 | < 0.1% |
| Other values (5560) | 5560 |
Length
Histogram of lengths of the category
Words
| Value | Count | Frequency (%) |
| do | 756 | 4.8% |
| são | 364 | 2.3% |
| de | 302 | 1.9% |
| santa | 161 | 1.0% |
| da | 143 | 0.9% |
| nova | 135 | 0.9% |
| sul | 115 | 0.7% |
| rio | 94 | 0.6% |
| dos | 73 | 0.5% |
| josé | 70 | 0.4% |
| Other values (9533) | 13640 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 10283 | 9.9% |
| a | 8791 | 8.5% |
| 0 | 8160 | 7.9% |
| o | 5961 | 5.8% |
| 1 | 4774 | 4.6% |
| 2 | 4591 | 4.4% |
| r | 4532 | 4.4% |
| i | 4388 | 4.2% |
| 3 | 4106 | 4.0% |
| e | 3764 | 3.6% |
| Other values (70) | 44311 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 50872 | |
| Decimal Number | 33420 | |
| Space Separator | 10283 | 9.9% |
| Uppercase Letter | 9010 | 8.7% |
| Other Punctuation | 47 | < 0.1% |
| Dash Punctuation | 29 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 8791 | |
| o | 5961 | |
| r | 4532 | |
| i | 4388 | |
| e | 3764 | 7.4% |
| n | 3196 | 6.3% |
| d | 2553 | 5.0% |
| s | 2423 | 4.8% |
| t | 2293 | 4.5% |
| u | 2155 | 4.2% |
| Other values (27) | 10816 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1137 | |
| C | 970 | |
| P | 911 | 10.1% |
| M | 721 | 8.0% |
| A | 698 | 7.7% |
| B | 602 | 6.7% |
| I | 475 | 5.3% |
| J | 405 | 4.5% |
| G | 391 | 4.3% |
| R | 367 | 4.1% |
| Other values (20) | 2333 |
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 8160 | |
| 1 | 4774 | |
| 2 | 4591 | |
| 3 | 4106 | |
| 5 | 3654 | |
| 4 | 2781 | 8.3% |
| 7 | 1470 | 4.4% |
| 6 | 1422 | 4.3% |
| 9 | 1382 | 4.1% |
| 8 | 1080 | 3.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 10283 | |
Other Punctuation
| Value | Count | Frequency (%) |
| ' | 47 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 29 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 59882 | |
| Common | 43779 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 8791 | |
| o | 5961 | 10.0% |
| r | 4532 | 7.6% |
| i | 4388 | 7.3% |
| e | 3764 | 6.3% |
| n | 3196 | 5.3% |
| d | 2553 | 4.3% |
| s | 2423 | 4.0% |
| t | 2293 | 3.8% |
| u | 2155 | 3.6% |
| Other values (57) | 19826 |
Common
| Value | Count | Frequency (%) |
| (space) | 10283 | |
| 0 | 8160 | |
| 1 | 4774 | |
| 2 | 4591 | |
| 3 | 4106 | 9.4% |
| 5 | 3654 | 8.3% |
| 4 | 2781 | 6.4% |
| 7 | 1470 | 3.4% |
| 6 | 1422 | 3.2% |
| 9 | 1382 | 3.2% |
| Other values (3) | 1156 | 2.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 100822 | |
| None | 2839 | 2.7% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 10283 | 10.2% |
| a | 8791 | 8.7% |
| 0 | 8160 | 8.1% |
| o | 5961 | 5.9% |
| 1 | 4774 | 4.7% |
| 2 | 4591 | 4.6% |
| r | 4532 | 4.5% |
| i | 4388 | 4.4% |
| 3 | 4106 | 4.1% |
| e | 3764 | 3.7% |
| Other values (54) | 41472 |
None
| Value | Count | Frequency (%) |
| ã | 794 | |
| á | 393 | |
| í | 336 | |
| é | 317 | 11.2% |
| ç | 268 | 9.4% |
| ó | 243 | 8.6% |
| â | 161 | 5.7% |
| ú | 101 | 3.6% |
| ô | 71 | 2.5% |
| ê | 70 | 2.5% |
| Other values (6) | 85 | 3.0% |
CV_BCG
| Distinct | 3579 |
|---|---|
| Distinct (%) | 64.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 51.90465171 |
| Minimum | 0 |
| Maximum | 594.72 |
| Zeros | 431 |
| Zeros (%) | 7.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 15.56 |
| median | 48.72 |
| Q3 | 80.6025 |
| 95-th percentile | 116.226 |
| Maximum | 594.72 |
| Range | 594.72 |
| Interquartile range (IQR) | 65.0425 |
Descriptive statistics
| Standard deviation | 42.85430973 |
|---|---|
| Coefficient of variation (CV) | 0.825635243 |
| Kurtosis | 12.01724139 |
| Mean | 51.90465171 |
| Median Absolute Deviation (MAD) | 32.525 |
| Skewness | 1.781430416 |
| Sum | 289108.91 |
| Variance | 1836.491862 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 431 | 7.7% |
| 100 | 32 | 0.6% |
| 50 | 23 | 0.4% |
| 20 | 14 | 0.3% |
| 75 | 13 | 0.2% |
| 16.67 | 13 | 0.2% |
| 66.67 | 12 | 0.2% |
| 60 | 11 | 0.2% |
| 6.67 | 11 | 0.2% |
| 18.75 | 11 | 0.2% |
| Other values (3569) | 4999 |
| Value | Count | Frequency (%) |
| 0 | 431 | |
| 0.11 | 1 | < 0.1% |
| 0.13 | 1 | < 0.1% |
| 0.21 | 1 | < 0.1% |
| 0.27 | 1 | < 0.1% |
| 0.31 | 1 | < 0.1% |
| 0.33 | 1 | < 0.1% |
| 0.39 | 2 | < 0.1% |
| 0.41 | 1 | < 0.1% |
| 0.47 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 594.72 | 1 | |
| 500.19 | 1 | |
| 473.92 | 1 | |
| 437.57 | 1 | |
| 396.37 | 1 | |
| 379.97 | 1 | |
| 340.54 | 1 | |
| 310 | 1 | |
| 291.8 | 1 | |
| 285.71 | 1 |
CV_HEPatite_B
| Distinct | 3329 |
|---|---|
| Distinct (%) | 59.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 83.40500718 |
| Minimum | 0 |
| Maximum | 430.77 |
| Zeros | 5 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 40.3 |
| Q1 | 70.43 |
| median | 84.33 |
| Q3 | 97.04 |
| 95-th percentile | 120 |
| Maximum | 430.77 |
| Range | 430.77 |
| Interquartile range (IQR) | 26.61 |
Descriptive statistics
| Standard deviation | 25.00849407 |
|---|---|
| Coefficient of variation (CV) | 0.2998440371 |
| Kurtosis | 10.12256374 |
| Mean | 83.40500718 |
| Median Absolute Deviation (MAD) | 13.23 |
| Skewness | 0.7267847412 |
| Sum | 464565.89 |
| Variance | 625.4247757 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 90 | 1.6% |
| 83.33 | 26 | 0.5% |
| 66.67 | 20 | 0.4% |
| 75 | 18 | 0.3% |
| 80 | 18 | 0.3% |
| 85.71 | 18 | 0.3% |
| 90 | 18 | 0.3% |
| 88.89 | 17 | 0.3% |
| 71.43 | 16 | 0.3% |
| 77.78 | 15 | 0.3% |
| Other values (3319) | 5314 |
| Value | Count | Frequency (%) |
| 0 | 5 | |
| 0.26 | 1 | < 0.1% |
| 3.48 | 1 | < 0.1% |
| 3.51 | 1 | < 0.1% |
| 3.75 | 1 | < 0.1% |
| 4.76 | 1 | < 0.1% |
| 5.36 | 1 | < 0.1% |
| 6.12 | 1 | < 0.1% |
| 6.21 | 1 | < 0.1% |
| 6.67 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 430.77 | 1 | |
| 330 | 1 | |
| 250 | 1 | |
| 231.58 | 1 | |
| 222.57 | 1 | |
| 221.43 | 1 | |
| 216 | 1 | |
| 200 | 1 | |
| 197.73 | 1 | |
| 190.91 | 1 |
CV_HIB
| Distinct | 3322 |
|---|---|
| Distinct (%) | 59.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 83.42763914 |
| Minimum | 0 |
| Maximum | 423.08 |
| Zeros | 5 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 40.3045 |
| Q1 | 70.4725 |
| median | 84.38 |
| Q3 | 97.1075 |
| 95-th percentile | 120 |
| Maximum | 423.08 |
| Range | 423.08 |
| Interquartile range (IQR) | 26.635 |
Descriptive statistics
| Standard deviation | 25.00687727 |
|---|---|
| Coefficient of variation (CV) | 0.2997433169 |
| Kurtosis | 9.551395468 |
| Mean | 83.42763914 |
| Median Absolute Deviation (MAD) | 13.265 |
| Skewness | 0.6938169252 |
| Sum | 464691.95 |
| Variance | 625.3439109 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 93 | 1.7% |
| 83.33 | 24 | 0.4% |
| 88.89 | 18 | 0.3% |
| 90 | 18 | 0.3% |
| 80 | 18 | 0.3% |
| 75 | 17 | 0.3% |
| 66.67 | 17 | 0.3% |
| 85.71 | 16 | 0.3% |
| 71.43 | 15 | 0.3% |
| 87.5 | 14 | 0.3% |
| Other values (3312) | 5320 |
| Value | Count | Frequency (%) |
| 0 | 5 | |
| 0.26 | 1 | < 0.1% |
| 3.48 | 1 | < 0.1% |
| 3.51 | 1 | < 0.1% |
| 3.75 | 1 | < 0.1% |
| 4.76 | 1 | < 0.1% |
| 5.36 | 1 | < 0.1% |
| 5.92 | 1 | < 0.1% |
| 6.12 | 1 | < 0.1% |
| 6.67 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 423.08 | 1 | |
| 330 | 1 | |
| 250 | 1 | |
| 231.58 | 1 | |
| 221.63 | 1 | |
| 221.43 | 1 | |
| 216 | 1 | |
| 200 | 1 | |
| 197.73 | 1 | |
| 190.91 | 1 |
CV_DPT
| Distinct | 3305 |
|---|---|
| Distinct (%) | 59.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 83.54537882 |
| Minimum | 0 |
| Maximum | 423.08 |
| Zeros | 5 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 40.419 |
| Q1 | 70.59 |
| median | 84.48 |
| Q3 | 97.2275 |
| 95-th percentile | 120 |
| Maximum | 423.08 |
| Range | 423.08 |
| Interquartile range (IQR) | 26.6375 |
Descriptive statistics
| Standard deviation | 24.9990235 |
|---|---|
| Coefficient of variation (CV) | 0.2992268855 |
| Kurtosis | 9.563700409 |
| Mean | 83.54537882 |
| Median Absolute Deviation (MAD) | 13.19 |
| Skewness | 0.6967920267 |
| Sum | 465347.76 |
| Variance | 624.951176 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 93 | 1.7% |
| 83.33 | 27 | 0.5% |
| 80 | 22 | 0.4% |
| 90 | 18 | 0.3% |
| 88.89 | 18 | 0.3% |
| 75 | 17 | 0.3% |
| 71.43 | 15 | 0.3% |
| 85.71 | 14 | 0.3% |
| 66.67 | 14 | 0.3% |
| 87.5 | 13 | 0.2% |
| Other values (3295) | 5319 |
| Value | Count | Frequency (%) |
| 0 | 5 | |
| 0.26 | 1 | < 0.1% |
| 3.48 | 1 | < 0.1% |
| 3.51 | 1 | < 0.1% |
| 3.75 | 1 | < 0.1% |
| 4.76 | 1 | < 0.1% |
| 5.36 | 1 | < 0.1% |
| 5.92 | 1 | < 0.1% |
| 6.12 | 1 | < 0.1% |
| 6.67 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 423.08 | 1 | |
| 330 | 1 | |
| 250 | 1 | |
| 231.58 | 1 | |
| 221.63 | 1 | |
| 221.43 | 1 | |
| 216 | 1 | |
| 200 | 1 | |
| 197.73 | 1 | |
| 190.91 | 1 |
CV_Polio
| Distinct | 3312 |
|---|---|
| Distinct (%) | 59.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 82.48775224 |
| Minimum | 0 |
| Maximum | 384.62 |
| Zeros | 5 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 40.366 |
| Q1 | 69.27 |
| median | 83.465 |
| Q3 | 95.9975 |
| 95-th percentile | 118.4525 |
| Maximum | 384.62 |
| Range | 384.62 |
| Interquartile range (IQR) | 26.7275 |
Descriptive statistics
| Standard deviation | 24.68308295 |
|---|---|
| Coefficient of variation (CV) | 0.2992333077 |
| Kurtosis | 7.808414651 |
| Mean | 82.48775224 |
| Median Absolute Deviation (MAD) | 13.18 |
| Skewness | 0.5924374076 |
| Sum | 459456.78 |
| Variance | 609.2545837 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 89 | 1.6% |
| 80 | 22 | 0.4% |
| 75 | 21 | 0.4% |
| 66.67 | 19 | 0.3% |
| 83.33 | 18 | 0.3% |
| 85.71 | 18 | 0.3% |
| 88.89 | 18 | 0.3% |
| 90 | 16 | 0.3% |
| 50 | 14 | 0.3% |
| 91.67 | 14 | 0.3% |
| Other values (3302) | 5321 |
| Value | Count | Frequency (%) |
| 0 | 5 | |
| 0.26 | 1 | < 0.1% |
| 1.62 | 1 | < 0.1% |
| 1.75 | 1 | < 0.1% |
| 3.48 | 1 | < 0.1% |
| 3.75 | 1 | < 0.1% |
| 4.08 | 1 | < 0.1% |
| 5.92 | 1 | < 0.1% |
| 5.95 | 1 | < 0.1% |
| 6.67 | 3 |
| Value | Count | Frequency (%) |
| 384.62 | 1 | |
| 330 | 1 | |
| 250 | 1 | |
| 234.21 | 1 | |
| 228.57 | 1 | |
| 224 | 1 | |
| 219.44 | 1 | |
| 200 | 1 | |
| 197.73 | 1 | |
| 190.91 | 1 |
CV_rota
| Distinct | 3355 |
|---|---|
| Distinct (%) | 60.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 82.20124417 |
| Minimum | 0 |
| Maximum | 370 |
| Zeros | 8 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 39.573 |
| Q1 | 68.54 |
| median | 83.235 |
| Q3 | 96 |
| 95-th percentile | 120 |
| Maximum | 370 |
| Range | 370 |
| Interquartile range (IQR) | 27.46 |
Descriptive statistics
| Standard deviation | 25.17800552 |
|---|---|
| Coefficient of variation (CV) | 0.3062971342 |
| Kurtosis | 7.573239546 |
| Mean | 82.20124417 |
| Median Absolute Deviation (MAD) | 13.735 |
| Skewness | 0.5994734931 |
| Sum | 457860.93 |
| Variance | 633.9319618 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 84 | 1.5% |
| 75 | 24 | 0.4% |
| 66.67 | 17 | 0.3% |
| 50 | 17 | 0.3% |
| 87.5 | 16 | 0.3% |
| 83.33 | 16 | 0.3% |
| 80 | 16 | 0.3% |
| 120 | 15 | 0.3% |
| 71.43 | 15 | 0.3% |
| 88.89 | 14 | 0.3% |
| Other values (3345) | 5336 |
| Value | Count | Frequency (%) |
| 0 | 8 | |
| 0.72 | 1 | < 0.1% |
| 0.87 | 1 | < 0.1% |
| 1.75 | 1 | < 0.1% |
| 3.85 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5.26 | 1 | < 0.1% |
| 5.56 | 1 | < 0.1% |
| 5.77 | 1 | < 0.1% |
| 5.95 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 370 | 1 | |
| 353.85 | 1 | |
| 271.43 | 1 | |
| 268 | 1 | |
| 228.57 | 1 | |
| 207.69 | 1 | |
| 200 | 2 | |
| 192.54 | 1 | |
| 185.71 | 1 | |
| 181.82 | 1 |
CV_Pneumo
| Distinct | 3363 |
|---|---|
| Distinct (%) | 60.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 85.16659066 |
| Minimum | 0 |
| Maximum | 440 |
| Zeros | 7 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 43.149 |
| Q1 | 71.865 |
| median | 86.11 |
| Q3 | 98.78 |
| 95-th percentile | 123.08 |
| Maximum | 440 |
| Range | 440 |
| Interquartile range (IQR) | 26.915 |
Descriptive statistics
| Standard deviation | 25.3644533 |
|---|---|
| Coefficient of variation (CV) | 0.2978216353 |
| Kurtosis | 11.55940878 |
| Mean | 85.16659066 |
| Median Absolute Deviation (MAD) | 13.45 |
| Skewness | 0.8200680335 |
| Sum | 474377.91 |
| Variance | 643.3554913 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 82 | 1.5% |
| 83.33 | 23 | 0.4% |
| 75 | 23 | 0.4% |
| 66.67 | 21 | 0.4% |
| 80 | 15 | 0.3% |
| 90 | 15 | 0.3% |
| 90.48 | 15 | 0.3% |
| 88.89 | 14 | 0.3% |
| 120 | 13 | 0.2% |
| 90.91 | 12 | 0.2% |
| Other values (3353) | 5337 |
| Value | Count | Frequency (%) |
| 0 | 7 | |
| 1.08 | 1 | < 0.1% |
| 1.74 | 1 | < 0.1% |
| 2.67 | 1 | < 0.1% |
| 3.51 | 1 | < 0.1% |
| 3.75 | 1 | < 0.1% |
| 5.95 | 1 | < 0.1% |
| 6.67 | 2 | < 0.1% |
| 6.94 | 1 | < 0.1% |
| 7.14 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 440 | 1 | |
| 369.23 | 1 | |
| 260 | 1 | |
| 257.14 | 1 | |
| 228.57 | 1 | |
| 208.33 | 1 | |
| 207.69 | 1 | |
| 205.97 | 1 | |
| 200 | 2 | |
| 190.91 | 1 |
CV_MNCC
| Distinct | 3343 |
|---|---|
| Distinct (%) | 60.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 82.74824776 |
| Minimum | 0 |
| Maximum | 400 |
| Zeros | 9 |
| Zeros (%) | 0.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 40.427 |
| Q1 | 70 |
| median | 83.61 |
| Q3 | 96.24 |
| 95-th percentile | 118.9915 |
| Maximum | 400 |
| Range | 400 |
| Interquartile range (IQR) | 26.24 |
Descriptive statistics
| Standard deviation | 24.67232558 |
|---|---|
| Coefficient of variation (CV) | 0.2981613056 |
| Kurtosis | 10.24818455 |
| Mean | 82.74824776 |
| Median Absolute Deviation (MAD) | 13.085 |
| Skewness | 0.6982630431 |
| Sum | 460907.74 |
| Variance | 608.7236498 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 97 | 1.7% |
| 80 | 25 | 0.4% |
| 75 | 20 | 0.4% |
| 88.89 | 20 | 0.4% |
| 66.67 | 20 | 0.4% |
| 83.33 | 19 | 0.3% |
| 50 | 14 | 0.3% |
| 81.25 | 14 | 0.3% |
| 71.43 | 14 | 0.3% |
| 81.82 | 13 | 0.2% |
| Other values (3333) | 5314 |
| Value | Count | Frequency (%) |
| 0 | 9 | |
| 2.67 | 1 | < 0.1% |
| 2.86 | 1 | < 0.1% |
| 3.51 | 1 | < 0.1% |
| 3.75 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 5.41 | 1 | < 0.1% |
| 6.67 | 1 | < 0.1% |
| 7.14 | 2 | < 0.1% |
| 7.41 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 400 | 1 | |
| 376.92 | 1 | |
| 272 | 1 | |
| 218.18 | 1 | |
| 214.29 | 1 | |
| 207.69 | 1 | |
| 200 | 1 | |
| 181.25 | 1 | |
| 179.1 | 1 | |
| 177.14 | 1 |
CV_SCR1
| Distinct | 3402 |
|---|---|
| Distinct (%) | 61.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 88.06631239 |
| Minimum | 0 |
| Maximum | 430 |
| Zeros | 8 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 41.8545 |
| Q1 | 73.2175 |
| median | 88.995 |
| Q3 | 102.5 |
| 95-th percentile | 128.891 |
| Maximum | 430 |
| Range | 430 |
| Interquartile range (IQR) | 29.2825 |
Descriptive statistics
| Standard deviation | 27.73196394 |
|---|---|
| Coefficient of variation (CV) | 0.314898662 |
| Kurtosis | 8.860177843 |
| Mean | 88.06631239 |
| Median Absolute Deviation (MAD) | 14.455 |
| Skewness | 0.876399964 |
| Sum | 490529.36 |
| Variance | 769.0618239 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 119 | 2.1% |
| 75 | 21 | 0.4% |
| 85.71 | 18 | 0.3% |
| 80 | 17 | 0.3% |
| 83.33 | 16 | 0.3% |
| 71.43 | 14 | 0.3% |
| 116.67 | 14 | 0.3% |
| 133.33 | 12 | 0.2% |
| 125 | 11 | 0.2% |
| 110 | 11 | 0.2% |
| Other values (3392) | 5317 |
| Value | Count | Frequency (%) |
| 0 | 8 | |
| 2.5 | 1 | < 0.1% |
| 3.11 | 1 | < 0.1% |
| 3.57 | 1 | < 0.1% |
| 4 | 2 | < 0.1% |
| 4.05 | 1 | < 0.1% |
| 4.48 | 1 | < 0.1% |
| 6.02 | 1 | < 0.1% |
| 6.14 | 1 | < 0.1% |
| 6.25 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 430 | 1 | |
| 369.23 | 1 | |
| 288.46 | 1 | |
| 285.71 | 1 | |
| 252.17 | 1 | |
| 252 | 1 | |
| 250 | 1 | |
| 232.76 | 1 | |
| 229.63 | 1 | |
| 228.57 | 1 |
CV_SCR2
| Distinct | 3642 |
|---|---|
| Distinct (%) | 65.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59.94277199 |
| Minimum | 0 |
| Maximum | 410 |
| Zeros | 136 |
| Zeros (%) | 2.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 6.67 |
| Q1 | 37.3525 |
| median | 60.98 |
| Q3 | 82.25 |
| 95-th percentile | 109.09 |
| Maximum | 410 |
| Range | 410 |
| Interquartile range (IQR) | 44.8975 |
Descriptive statistics
| Standard deviation | 31.76216873 |
|---|---|
| Coefficient of variation (CV) | 0.5298748735 |
| Kurtosis | 2.88765018 |
| Mean | 59.94277199 |
| Median Absolute Deviation (MAD) | 22.35 |
| Skewness | 0.4097075555 |
| Sum | 333881.24 |
| Variance | 1008.835362 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 136 | 2.4% |
| 100 | 59 | 1.1% |
| 66.67 | 24 | 0.4% |
| 50 | 21 | 0.4% |
| 83.33 | 18 | 0.3% |
| 75 | 18 | 0.3% |
| 33.33 | 15 | 0.3% |
| 84.62 | 14 | 0.3% |
| 25 | 14 | 0.3% |
| 37.5 | 13 | 0.2% |
| Other values (3632) | 5238 |
| Value | Count | Frequency (%) |
| 0 | 136 | |
| 0.21 | 1 | < 0.1% |
| 0.26 | 1 | < 0.1% |
| 0.29 | 1 | < 0.1% |
| 0.43 | 1 | < 0.1% |
| 0.53 | 1 | < 0.1% |
| 0.58 | 1 | < 0.1% |
| 0.61 | 1 | < 0.1% |
| 0.7 | 1 | < 0.1% |
| 0.71 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 410 | 1 | |
| 292.31 | 1 | |
| 233.33 | 1 | |
| 228 | 1 | |
| 200 | 1 | |
| 182.61 | 1 | |
| 180 | 1 | |
| 161.54 | 1 | |
| 158.82 | 1 | |
| 157.14 | 1 |
CV_VARICELA
| Distinct | 3545 |
|---|---|
| Distinct (%) | 63.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 80.09573429 |
| Minimum | 0 |
| Maximum | 370 |
| Zeros | 14 |
| Zeros (%) | 0.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 29.707 |
| Q1 | 61.915 |
| median | 81.25 |
| Q3 | 96.6775 |
| 95-th percentile | 125 |
| Maximum | 370 |
| Range | 370 |
| Interquartile range (IQR) | 34.7625 |
Descriptive statistics
| Standard deviation | 30.17910747 |
|---|---|
| Coefficient of variation (CV) | 0.3767879493 |
| Kurtosis | 3.885998434 |
| Mean | 80.09573429 |
| Median Absolute Deviation (MAD) | 17 |
| Skewness | 0.6246051422 |
| Sum | 446133.24 |
| Variance | 910.7785276 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 66 | 1.2% |
| 75 | 21 | 0.4% |
| 80 | 20 | 0.4% |
| 66.67 | 17 | 0.3% |
| 87.5 | 16 | 0.3% |
| 83.33 | 15 | 0.3% |
| 120 | 15 | 0.3% |
| 0 | 14 | 0.3% |
| 88.89 | 14 | 0.3% |
| 50 | 13 | 0.2% |
| Other values (3535) | 5359 |
| Value | Count | Frequency (%) |
| 0 | 14 | |
| 0.3 | 1 | < 0.1% |
| 0.91 | 1 | < 0.1% |
| 1.3 | 1 | < 0.1% |
| 2.04 | 1 | < 0.1% |
| 2.42 | 1 | < 0.1% |
| 2.78 | 1 | < 0.1% |
| 2.86 | 1 | < 0.1% |
| 3.51 | 1 | < 0.1% |
| 3.88 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 370 | 1 | |
| 284.62 | 1 | |
| 264.29 | 1 | |
| 254.55 | 1 | |
| 242.42 | 1 | |
| 233.33 | 1 | |
| 231.58 | 1 | |
| 228.26 | 1 | |
| 226.47 | 1 | |
| 222.89 | 1 |
CV_HEPATITE_A
| Distinct | 3370 |
|---|---|
| Distinct (%) | 60.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 79.59285637 |
| Minimum | 0 |
| Maximum | 470 |
| Zeros | 24 |
| Zeros (%) | 0.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 43.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 31.3725 |
| Q1 | 64.715 |
| median | 81.48 |
| Q3 | 95.6875 |
| 95-th percentile | 119.05 |
| Maximum | 470 |
| Range | 470 |
| Interquartile range (IQR) | 30.9725 |
Descriptive statistics
| Standard deviation | 27.24634795 |
|---|---|
| Coefficient of variation (CV) | 0.3423215247 |
| Kurtosis | 11.15375329 |
| Mean | 79.59285637 |
| Median Absolute Deviation (MAD) | 15.28 |
| Skewness | 0.6491802866 |
| Sum | 443332.21 |
| Variance | 742.3634766 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 100 | 97 | 1.7% |
| 75 | 29 | 0.5% |
| 50 | 24 | 0.4% |
| 0 | 24 | 0.4% |
| 66.67 | 21 | 0.4% |
| 85.71 | 19 | 0.3% |
| 80 | 19 | 0.3% |
| 83.33 | 16 | 0.3% |
| 81.82 | 13 | 0.2% |
| 111.11 | 13 | 0.2% |
| Other values (3360) | 5295 |
| Value | Count | Frequency (%) |
| 0 | 24 | |
| 0.25 | 1 | < 0.1% |
| 0.36 | 1 | < 0.1% |
| 1.11 | 1 | < 0.1% |
| 1.24 | 1 | < 0.1% |
| 1.61 | 1 | < 0.1% |
| 1.8 | 1 | < 0.1% |
| 1.85 | 1 | < 0.1% |
| 1.92 | 1 | < 0.1% |
| 2.38 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 470 | 1 | |
| 376.92 | 1 | |
| 271.43 | 1 | |
| 240 | 1 | |
| 211.11 | 1 | |
| 200 | 1 | |
| 192.86 | 1 | |
| 190 | 1 | |
| 188.89 | 1 | |
| 183.33 | 1 |
Auto
The auto setting is an easily interpretable pairwise column metric that selects a method for each pair of variable types: categorical-categorical uses Cramér's V, numerical-categorical uses Cramér's V (on a discretized numerical column), and numerical-numerical uses Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
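As a minimal sketch of the calculation just described, the following pure-Python example ranks both variables (tied values get the mean of their positions) and then divides the covariance of the ranks by the product of their standard deviations. The helper names `average_ranks` and `spearman_rho` and the sample data are illustrative assumptions, not part of the report or of pandas-profiling.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    # The 1/n factors of the covariance and standard deviations cancel out.
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Illustrative data (not from the dataset): y = x**2 is nonlinear but monotonic.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
rho_up = spearman_rho(x, y)
rho_down = spearman_rho(x, y[::-1])
print(rho_up, rho_down)
```

Because y increases whenever x does, the ranks agree exactly and ρ comes out as 1 up to floating-point rounding, while reversing y gives -1.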
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
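The definition and the invariance property can be illustrated with a short pure-Python sketch. The function name `pearson_r` and the sample data are assumptions made for illustration only; the shift and scale factors are arbitrary positive values.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


# Illustrative data (not from the dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]

r = pearson_r(x, y)
# Shift and rescale each variable separately, using positive scale factors.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
print(r, r_transformed)
```

The two results should match up to floating-point rounding. The value stays below 1 because the relationship is monotonic but not linear.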
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
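The pair-counting definition above can be sketched directly in pure Python. This follows the simple formula as stated (often called tau-a), with no tie correction; library implementations may apply a tie correction, so their values can differ when ties are present. The function name `kendall_tau_a` and the sample data are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither in this sketch.
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Illustrative data (not from the dataset): one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs, of which five are concordant and one is discordant, so τ = (5 - 1) / 6.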
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for the method.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
First rows
| | COD | Município | CV_BCG | CV_HEPatite_B | CV_HIB | CV_DPT | CV_Polio | CV_rota | CV_Pneumo | CV_MNCC | CV_SCR1 | CV_SCR2 | CV_VARICELA | CV_HEPATITE_A |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 110001 | 110001 Alta Floresta D'Oeste | 67.57 | 93.69 | 93.69 | 93.99 | 91.29 | 93.69 | 97.00 | 97.00 | 115.11 | 61.63 | 88.22 | 91.24 |
| 1 | 110002 | 110002 Ariquemes | 108.69 | 83.21 | 83.28 | 83.28 | 83.68 | 85.00 | 89.05 | 88.32 | 87.17 | 59.49 | 78.88 | 80.82 |
| 2 | 110003 | 110003 Cabixi | 0.00 | 92.75 | 92.75 | 92.75 | 94.20 | 101.45 | 101.45 | 89.86 | 98.51 | 94.03 | 97.01 | 97.01 |
| 3 | 110004 | 110004 Cacoal | 108.98 | 88.45 | 89.28 | 89.28 | 89.13 | 90.04 | 91.85 | 92.75 | 135.84 | 27.32 | 82.19 | 82.04 |
| 4 | 110005 | 110005 Cerejeiras | 63.94 | 92.19 | 92.94 | 92.94 | 94.05 | 95.54 | 98.14 | 95.91 | 103.37 | 83.90 | 93.63 | 92.88 |
| 5 | 110006 | 110006 Colorado do Oeste | 19.21 | 89.66 | 89.66 | 89.66 | 89.16 | 93.60 | 97.04 | 93.10 | 100.00 | 84.73 | 99.51 | 102.46 |
| 6 | 110007 | 110007 Corumbiara | 14.02 | 80.37 | 80.37 | 80.37 | 86.92 | 92.52 | 95.33 | 91.59 | 106.86 | 120.59 | 124.51 | 124.51 |
| 7 | 110008 | 110008 Costa Marques | 77.37 | 95.26 | 95.26 | 95.79 | 92.11 | 102.63 | 106.32 | 99.47 | 109.19 | 83.78 | 92.43 | 92.97 |
| 8 | 110009 | 110009 Espigão D'Oeste | 91.30 | 83.44 | 83.44 | 83.44 | 84.08 | 94.27 | 97.66 | 95.33 | 92.29 | 2.78 | 90.36 | 86.94 |
| 9 | 110010 | 110010 Guajará-Mirim | 62.12 | 47.32 | 47.32 | 47.32 | 46.39 | 46.27 | 47.79 | 46.50 | 52.44 | 15.97 | 37.54 | 38.50 |
Last rows
| | COD | Município | CV_BCG | CV_HEPatite_B | CV_HIB | CV_DPT | CV_Polio | CV_rota | CV_Pneumo | CV_MNCC | CV_SCR1 | CV_SCR2 | CV_VARICELA | CV_HEPATITE_A |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5560 | 522160 | 522160 Uruaçu | 93.58 | 86.23 | 86.23 | 86.23 | 80.94 | 89.62 | 93.77 | 88.30 | 93.71 | 40.76 | 65.33 | 81.33 |
| 5561 | 522170 | 522170 Uruana | 22.96 | 72.59 | 72.59 | 72.59 | 71.85 | 77.78 | 86.67 | 79.26 | 76.87 | 48.51 | 63.43 | 68.66 |
| 5562 | 522180 | 522180 Urutaí | 95.83 | 170.83 | 170.83 | 170.83 | 166.67 | 150.00 | 145.83 | 175.00 | 170.83 | 20.83 | 125.00 | 129.17 |
| 5563 | 522185 | 522185 Valparaíso de Goiás | 20.62 | 79.94 | 79.94 | 80.10 | 80.75 | 79.58 | 82.08 | 79.66 | 77.84 | 35.90 | 51.02 | 74.00 |
| 5564 | 522190 | 522190 Varjão | 37.93 | 131.03 | 131.03 | 131.03 | 131.03 | 124.14 | 127.59 | 117.24 | 111.11 | 122.22 | 203.70 | 155.56 |
| 5565 | 522200 | 522200 Vianópolis | 125.00 | 104.65 | 104.65 | 105.23 | 99.42 | 104.07 | 111.63 | 101.16 | 105.85 | 87.13 | 119.88 | 115.20 |
| 5566 | 522205 | 522205 Vicentinópolis | 9.35 | 56.12 | 56.12 | 56.12 | 57.55 | 53.96 | 54.68 | 47.48 | 72.46 | 13.77 | 34.06 | 26.81 |
| 5567 | 522220 | 522220 Vila Boa | 64.71 | 131.37 | 131.37 | 131.37 | 129.41 | 127.45 | 131.37 | 133.33 | 123.53 | 74.51 | 82.35 | 125.49 |
| 5568 | 522230 | 522230 Vila Propício | 28.30 | 67.92 | 67.92 | 67.92 | 67.92 | 60.38 | 67.92 | 62.26 | 56.60 | 33.96 | 43.40 | 52.83 |
| 5569 | 530010 | 530010 Brasília | 104.85 | 78.82 | 78.77 | 78.76 | 78.74 | 80.90 | 84.41 | 81.75 | 87.12 | 57.46 | 78.90 | 81.75 |